# A tibble: 2 × 3
  class    precision recall
  <fct>        <dbl>  <dbl>
1 City Pop     0.774  0.774
2 U.S. Pop     0.774  0.774
To compute the model, I capped the playlists at 31 songs in each group. Using the formula (TP + TN) / total (according to this website), the accuracy of the model is currently (25 + 23) / 62 ≈ 0.77. According to various sources like this one, that is a relatively good accuracy while still being within a realistic interval.
From this, I assume that the model is reasonably good at telling City Pop and U.S. Pop apart. Under the next header, I will look at which features were the most important for classifying these playlists, and whether my visualizations from the earlier weeks were effective at all.
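As a quick check of the arithmetic above, here is a minimal Python sketch of the accuracy formula. The function name `accuracy` and its parameter names are my own choice for illustration; the counts 25, 23 and 62 are the ones quoted in the text.

```python
def accuracy(true_positives: int, true_negatives: int, total: int) -> float:
    """Share of all tracks classified correctly: (TP + TN) / total."""
    if total <= 0:
        raise ValueError("total must be a positive number of tracks")
    return (true_positives + true_negatives) / total


# Counts quoted in the text: 25 and 23 correct tracks out of 62 in total.
print(round(accuracy(25, 23, 62), 3))
```

With only two classes, it does not matter which class is treated as "positive": the correct predictions of both classes are added together either way.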
# A tibble: 2 × 3
  class    precision recall
  <fct>        <dbl>  <dbl>
1 City Pop     0.84   0.677
2 U.S. Pop     0.730  0.871

          Truth
Prediction City Pop U.S. Pop
  City Pop       21        4
  U.S. Pop       10       27
I hope this is the right way of getting the confusion matrix out of the random forest model. If I use the same formula as in the previous slide, I get (21 + 27) / 62 ≈ 0.77, the same accuracy as the last one.
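The precision and recall values in the table can be recomputed by hand from the confusion matrix. The sketch below does this in plain Python rather than with the modelling package that produced the output; the names `CLASSES`, `MATRIX`, `class_metrics` and `overall_accuracy` are illustrative assumptions, and only the four counts come from the matrix above.

```python
# Confusion matrix from the output above: rows are predictions, columns are the truth.
CLASSES = ["City Pop", "U.S. Pop"]
MATRIX = [
    [21, 4],   # predicted City Pop
    [10, 27],  # predicted U.S. Pop
]


def class_metrics(matrix, index):
    """Return (precision, recall) for the class at position `index`."""
    true_positive = matrix[index][index]
    predicted = sum(matrix[index])               # row total: everything predicted as this class
    actual = sum(row[index] for row in matrix)   # column total: everything truly in this class
    return true_positive / predicted, true_positive / actual


def overall_accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


for i, name in enumerate(CLASSES):
    precision, recall = class_metrics(MATRIX, i)
    print(f"{name}: precision={precision:.3f}, recall={recall:.3f}")
print(f"accuracy={overall_accuracy(MATRIX):.3f}")
```

Read this way, the model rarely labels a U.S. Pop track as City Pop (4 tracks), but it sends 10 of the 31 City Pop tracks to U.S. Pop, which is why City Pop has the lower recall.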
Regarding the feature selection, the graph shows that the timbre vector c11 is the most important feature, followed by the track-level feature loudness, followed in turn by timbre vector c1. This is quite humorous, as it was said in one of the lectures that c1 is the rough equivalent of loudness. After those, the most important features are primarily timbre features. This is quite interesting, because at the very start of the course (without any knowledge of musical terminology) I said that City Pop songs have a lot of layers to them. By that I meant that “there is a lot going on” and that it sounds different.
For the final portfolio, I will use these features in order to improve the existing visualizations I have (and I have the ability to remove some of them). Yay!
Here is the distribution of tempi within my corpus, divided between the Japanese group and the U.S. group. As the graphs show, there are distinct differences in which tempo is preferred. In the western world, the preferred tempo is around 120 bpm. Interestingly, this corresponds well to research that has found 120-125 bpm to be a natural tempo for humans, along with a link between this natural tempo and the bpm of music. However, according to the corpus I am using, this is not the case for the eastern world. The distribution here is more even, with the songs spread over the range of 100-135 bpm, although more of them lie around the 105 bpm mark.
Aside from the obvious archetypes in this distribution, there are two outliers, one from each group. These are the songs “A HOPE FROM SAD STREET” by Anri and “Love Is a Contact Sport” by Whitney Houston, with tempi of around 180 and 175 bpm respectively.
Here is the tempogram of the identified outlier from the graph of energy’s effect on danceability, where the energy is low but the danceability is high. This is a completely standard tempogram: the computation ran without any disturbances and nothing went wrong. “Billie Jean” is a very stereotypical song in this regard, with a bpm of approximately 120 and a very steady beat, rhythm and tempo that do not seem to change. This is probably part of the reason the overall danceability of this piece is very high, even though its energy is very low.
In the histogram of the tempo distribution, there were a couple of outliers across the two genres. “A HOPE FROM SAD STREET” by Anri is one of them. Here, in comparison to “Billie Jean”, there were apparently some issues when estimating the tempo, as multiple yellow lines flicker across the entire piece. However, as one can tell, it was not a complete failure, as the tempogram shows not one but two tempo octaves. It is not as stable as the previous piece, but the lines are clearly there.
This flickering could be attributed to the way this song is realized, with its many layers of sound. There is the bassline, but there is also a layer of trumpet as well as a choir, and there are points where the song stops and picks up again. The bassline is often replaced by piano and guitar solos before returning to the “status quo”. As far as I can tell, these instances correspond to the flickering points in the tempogram.
In comparison to Anri’s song, “Love Is a Contact Sport” by Whitney Houston does not have any issues at all, as the image clearly shows. The song is also realized very clearly, with a clear tempo throughout. It was also one of the outliers in the distribution chart, as the western counterpart to Anri’s song. Like Anri’s song, it shows two tempo octaves.
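To make the term “tempo octave” concrete: tempi that differ by a factor of two describe the same pulse counted at half or double speed, so a tempogram tends to show them as parallel lines. The sketch below lists such related tempi. The function `tempo_octaves` and its `low`/`high` display range are hypothetical choices of mine, and I have not read the exact values of the two lines off the tempograms; only the roughly 175 bpm figure comes from the histogram discussion.

```python
def tempo_octaves(bpm, low=40.0, high=400.0):
    """List the tempi related to `bpm` by factors of two within [low, high]."""
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    tempo = float(bpm)
    # Halve until the next halving would fall below the lower bound.
    while tempo / 2 >= low:
        tempo /= 2
    octaves = []
    # Then double upwards, collecting every value that is inside the range.
    while tempo <= high:
        if tempo >= low:
            octaves.append(tempo)
        tempo *= 2
    return octaves


# Roughly the tempo reported for "Love Is a Contact Sport".
print(tempo_octaves(175))
```

This is also why an outlier at 175 or 180 bpm in the histogram could arguably be heard at half that tempo: both readings lie on the same set of tempo octaves.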
Once again, here is a song by Anri, called “CAT’S EYE - (NEW TAKE)”. I wanted to include this because the algorithm estimates the tempo very clearly in the beginning, but around 130 seconds into the song there is a clear window where it was not able to do so. The reason is clear if you listen to the song, as the bass guitar has a solo around that time. To me, the solo feels quicker than the rest of the song, which the tempogram does show to some extent. However, this shift was clearly something the tempogram could not handle, which I found interesting.
As one can see from the graph here, the keys of the songs in each of my corpus groups are, overall, quite evenly distributed. However, one can tell that the groups prefer different keys. City Pop seems to prefer C, D, F, and G for the most part, whereas U.S. Pop, while also having a decent number of songs in C, seems to prefer D#, A, and B as well, unlike City Pop. It should be noted, however, that there are a few more songs in the City Pop group than in the U.S. one, which can skew the distributions.
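One way to reduce the skew from unequal group sizes would be to compare shares instead of raw counts. The sketch below shows the idea, assuming the key of each track is available as a text label. The function `key_shares` is my own, and the two lists are made-up placeholder values, not the keys from my corpus.

```python
from collections import Counter


def key_shares(keys):
    """Map each key to its share of the tracks in one group."""
    if not keys:
        raise ValueError("need at least one track")
    counts = Counter(keys)
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}


# Placeholder key labels for illustration only, NOT taken from the corpus.
city_pop_keys = ["C", "C", "D", "F", "G", "G"]
us_pop_keys = ["C", "D#", "A", "B"]

for name, keys in [("City Pop", city_pop_keys), ("U.S. Pop", us_pop_keys)]:
    shares = key_shares(keys)
    print(name, {key: round(share, 2) for key, share in sorted(shares.items())})
```

Because the shares in each group add up to 1, a key that is equally popular in both groups gets the same bar height even when one playlist is longer.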
-PLEASE DO NOT SHOW THIS PORTFOLIO IN CLASS-
I am comparing two playlists of three artists each: one Japanese playlist and one US playlist. The Japanese group consists of the artists Taeko Onuki, Miki Matsubara and Anri. Their US counterparts are Michael Jackson, Whitney Houston and Madonna. I chose these corpora because I want to explore whether there are distinct differences between (city) pop as it was in Japan in the 80’s and the pop that was popular in the western world in the same decade. Japanese city pop was influenced by western music, so I expect there to be many similarities in the use of sound, instruments and types of rhythm. However, one aspect where I am particularly interested in finding a difference is the prevalence of bass and rhythms. It is also interesting to see whether there are differences in other aspects, like “supplementary” sounds. However, I am unsure to what extent they are different.
As I have chosen three artists to represent their own varieties of each genre, there might be nuances and representations I am missing. Taeko Onuki, Miki Matsubara and Anri were chosen due to their popularity on Spotify (the number of general listeners as well as the listens to their tracks). I also have to mention that personal preference played a part in the selection. The same method was used in choosing the western counterparts. The genres are very broad, despite their popularity, so some varieties might have been overlooked. However, the popularity of the chosen artists is a strength.
Typical, and popular, tracks from the Japanese playlist are:
These songs are typical in the sense that they make prominent use of basslines and clear rhythms, and have many “layers” to them.
The western counterparts have typical tracks like:
These last three tracks especially have the typical and distinct features of 80’s pop, namely the sharp drums, the heavily synthesized piano sounds and what I think of as an almost “dreamy” sound.
Atypical songs can include:
This was an interesting find that proved to be quite interesting in the
“Billie Jean” was listed as an outlier for the Pop group in the graph of whether energy has an effect on danceability, in the sense that Spotify rates it as a non-energetic song with an extremely high danceability. As one can see in this chromagram, “Billie Jean” has several areas where the magnitude is over 0.75. A couple of patterns that arise are the use of D and C#/Db just before 100 and 200 seconds, which form an almost pyramid-like shape. Judging by the chromas, one would take it for a very energetic song, even though Spotify treats it as a very slow and unenergetic one.
Here are plots to give insight into the general effect of tempo on energy in the groups City Pop and US Pop. In the first histogram, one can see that the distribution of energy is more even for US Pop than for City Pop. It shows that 15 of the City Pop tracks in my playlist have an energy of a little over 0.6.
Here is a plot of the effect of energy on danceability, with the size of the points as well as the band around the line indicating the tempo of the songs. One can see more of a linear trend for US Pop, indicating that up to an energy of around 0.5 there seems to be a correlation between energy and danceability. Tempo, however, seems to show no pattern at first glance. For the City Pop playlist, there seems to be a slight curve at the beginning of the graph, but overall the effect of energy on danceability is very even. At first glance, there also seems to be no trend for tempo.
However, one can tell that U.S. Pop seems to have more of a positive linear correlation between energy and danceability, more than its eastern counterpart in any case.
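The impression of a “positive linear correlation” could be backed up with a number by computing Pearson’s correlation coefficient per group. Below is a minimal sketch of that calculation in plain Python. The function `pearson` is written out by hand for illustration, and the energy and danceability values are invented placeholders, not values from my playlists.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / sqrt(spread_x * spread_y)


# Invented placeholder values for illustration only, NOT from the corpus.
energy = [0.2, 0.4, 0.6, 0.8]
danceability = [0.3, 0.5, 0.6, 0.9]
print(round(pearson(energy, danceability), 2))
```

A value close to 1 would support a positive linear trend, a value close to 0 would mean there is no linear relationship, and outliers such as “Billie Jean” can pull the coefficient down noticeably in a group of only 31 songs.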
Across the two groups, there are two outliers that especially draw one’s eye: “Billie Jean” by Michael Jackson and “横顔” by Taeko Onuki.
So far, we can see that one difference between US Pop and City Pop, based on the corpus I am using, is that energy definitely seems to have an effect on danceability in US Pop, up to a certain level. However, there are many outliers that can skew this, which will be identified shortly. Furthermore, tempo does not seem to show a definite pattern with either energy or danceability.
Another observation is that danceability seems to have a positive effect on valence in both genres. However, one can see that US Pop has more songs that are both more danceable and happier than City Pop does.
This makes sense, given the fact that according to the histogram, none of the city pop songs go above a certain threshold of valence, in comparison to US pop. This could indicate that city pop is generally less “happy”.
But one outlier I want to talk about in particular is the one track in the energy, danceability and tempo plot where the energy is quite low in comparison to other tracks, but the danceability is one of the highest in the group. While I am not sure how to point this out in the plot, I have managed to identify it as the track “Billie Jean” by Michael Jackson. A separate chromagram has been made in order to account for this.
For this homework, I seem to have been able to provide analyses of the tempo histogram as well as of some outliers in different regards. Please give some feedback on whether any analytic parts are missing or might be good to add. Thank you!